Visual and geolocation analytic system and method

ABSTRACT

A visual and geolocation analytic system is provided, including: an analytic device and a number of image capturing devices connected to said analytic device. The image capturing devices capture images of an object at a time interval and send said captured images to said analytic device; said analytic device comprises a deep learning model for analyzing said captured images, allowing said object to be identified and tagged, and allowing a path of movement of said object across time to be tracked. The present invention tracks the position of an object within an area continuously across time, and transforms the object in captured images into a structured data set for analysis.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims priority from US provisional application 62/622,145 filed on 26 Jan. 2018, the entirety of which is incorporated herein by reference.

FIELD OF INVENTION

The present application is related to a visual and geolocation analytic system, and in particular a visual and geolocation analytic system using edge computing technology.

BACKGROUND OF INVENTION

In the information-driven society today, any kind of information can be valuable. For example, for a retail store, customer information such as foot traffic or consumer demographics is very important for the owners to provide the best experience and products to customers. Currently, much of this research is done manually, and is hence costly, time consuming, ineffective and inconsistent in providing useful insights.

SUMMARY OF INVENTION

In light of the background, an advanced visual and geolocation analytic system is needed for more efficient and convenient analysis. A key feature is to transform an object in captured images into a structured data set for analysis. Data can be collected locally and synchronized with other devices.

In one embodiment of the present invention, a visual and geolocation analytic system is provided, including: an analytic device and a number of image capturing devices connected to said analytic device. The image capturing devices capture images of an object at a time interval and send said captured images to said analytic device; said analytic device comprises a deep learning model for analyzing said captured images, allowing said object to be identified and tagged, and allowing a path of movement of said object across time to be tracked.

In a preferred embodiment, the deep learning model of said analytic device comprises a facial recognition system for recognizing a person. In a further embodiment, the facial recognition system further determines demographic information of said person. In another preferred embodiment, the deep learning model of said analytic device comprises an object detection system for recognizing a stock keeping unit of a product item.

In a preferred embodiment, the deep learning model is trained with image classification using convolutional neural networks and fine tuned with bounding boxes and image distortions, so that it is capable of object detection by recognizing distinct attributes within images and of tagging positional information on said object.

In a preferred embodiment, the system further comprises a transaction device linked to said analytic device, wherein said transaction device sends a transaction record involving said object to said analytic device, said analytic device integrates said transaction record into said path of movement of said object.

In a preferred embodiment, the system further comprises at least one weight sensor linked to said analytic device, wherein said weight sensor sends a signal to said analytic device when a weight change is detected at a sensed location.

In a preferred embodiment, the captured images are discarded from said analytic device after analyzing, and only a structured data set containing an identity and said path of movement of said object across time is retained. In a preferred embodiment, the analytic device is synchronized with analytic devices of other systems to form a distributed edge network.

In another aspect of the invention, a method for visual and geolocation analysis is provided. The method comprises the steps of: capturing images of an object at a time interval with a number of image capturing devices; recognizing and tagging said object from said captured images by an analytic device having a deep learning model; extracting time and location data of said object from said captured images, and tracking a path of movement of said object across time.

The present invention provides significant advantages over existing systems. For example, the present invention tracks the position of a person within an area continuously across time. The path of movement of the person is tracked, and any product items that the person has purchased are also tracked. The statistics can be analyzed, and the layout of the area or the product lineup can be optimized. Demographic information of the person can also be determined, and targeted information can be provided to the person when the person is within or proximate to the area.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram of a visual and geolocation tracking system according to an embodiment of the present invention.

FIG. 2 shows an example setting of using the visual and geolocation tracking system of the present invention.

FIG. 3 shows an example flow of how the visual and geolocation tracking system of the present invention tracks a person and an object.

DETAILED DESCRIPTION OF INVENTION

FIG. 1 shows a first embodiment of the present invention. The invention comprises an analytic device 20, and a number of image capturing devices 22 connected to the analytic device 20. The analytic device 20 comprises a deep learning model 24 having a graphics processor unit, and a local data storage 26. The analytic device 20 also comprises communication means such as wired or wireless communication modules for connecting to external devices.

In an example setting as shown in FIG. 2, said image capturing devices 22 are installed at fixed locations of a predetermined area, such as exits 36 and entrances 34, corners or along a path, or locations of importance such as a product shelf 32. Preferably, the image capturing devices 22 are installed to cover a continuous area, and more preferably at least two image capturing devices 22 simultaneously cover an area at different angles for more accurate data collection. In a preferred embodiment, the image capturing devices 22 are installed vertically with the lens facing directly downwards, but they can also be installed at any tilt angle. Although FIG. 2 shows the setting as a room bound by walls, the system can also be installed in open areas, where the boundaries can be arbitrarily defined by the user depending on the area of coverage of the image capturing devices 22. The analytic device 20 does not need to be exposed and can be disposed at any location so long as it is connected to the image capturing devices 22. The analytic device 20 can also be integrated into the image capturing device 22.

In a preferred embodiment, the image capturing devices 22 are powered and connected to the analytic device 20 through PoE (Power over Ethernet) technology. In another embodiment, power supply and connectivity of the image capturing devices 22 are provided separately. The image capturing devices 22 can be connected to the analytic device 20 through wired or wireless means.

In an example operation as shown in FIG. 3, a person 28 enters the room at an entrance 34, picks up a product item 30 from the product shelf 32, and leaves the room at an exit 36. When the person 28 enters the room, the image capturing devices 22 capture an image including the person 28 and send it to the analytic device 20. The deep learning model 24 of the analytic device 20 analyzes the captured image and determines that the person 28 is a unique object X. The analytic device 20 then tags the person 28 and generates a structured data set for the person 28, including time and location data of the person 28 based on the captured image, and stores the structured data set in the local data storage 26. The location data is determined by mapping the space in the area of detection through calculated placement of the image capturing devices 22, and extracted based on the coordinates of the person 28 in the captured images. The location data can be in two dimensions or three dimensions.
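
The disclosure does not prescribe a particular mapping from image coordinates to locations in the area. Purely as an illustration, the following sketch assumes a planar floor and a pre-calibrated planar homography per camera, and shows how one entry of the structured data set could be built. The names and matrix values (CAMERA_HOMOGRAPHY, to_floor_coordinates) are hypothetical and are not part of the claimed system.

```python
import time
import numpy as np

# Illustrative only: a 3x3 homography mapping pixel coordinates of one
# camera to floor-plane coordinates (in metres), obtained in advance by
# calibrating the camera against known reference points in the area.
CAMERA_HOMOGRAPHY = np.array([
    [0.01, 0.00, -3.2],
    [0.00, 0.01, -1.8],
    [0.00, 0.00,  1.0],
])

def to_floor_coordinates(pixel_xy, homography=CAMERA_HOMOGRAPHY):
    """Project a detected pixel position onto the floor plane."""
    u, v = pixel_xy
    p = homography @ np.array([u, v, 1.0])
    return (p[0] / p[2], p[1] / p[2])          # (x, y) in metres

def make_record(object_id, pixel_xy):
    """Build one entry of the structured data set for a tagged object."""
    x, y = to_floor_coordinates(pixel_xy)
    return {"object_id": object_id,            # e.g. unique object X
            "timestamp": time.time(),          # time data
            "location": (round(x, 2), round(y, 2))}  # location data

# Example: person 28 detected at pixel (640, 360) in a captured image.
print(make_record("X", (640, 360)))
```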

In a preferred embodiment, the deep learning model 24 is trained with image classification using convolutional neural networks. The deep learning model 24 is fine tuned with bounding boxes and image distortions so that it is capable of object detection by recognizing distinct attributes within images, and of tagging positional information on the object. The deep learning model 24 is trained to run with low computational memory usage.
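
The disclosure does not fix a specific training framework. As a non-limiting sketch, the scheme described above (a convolutional backbone pre-trained for image classification, fine-tuned with bounding boxes and image distortions) could be approximated with the torchvision library roughly as follows; the class count, hyperparameters, and augmentation choices are assumptions for illustration only.

```python
import torch
import torchvision
from torchvision.transforms import v2 as T

# Illustrative "image distortion" augmentation applied during fine-tuning;
# photometric distortion changes colour and brightness without moving the
# bounding boxes, so it can safely be applied to the images alone.
distort = T.Compose([
    T.RandomPhotometricDistort(p=0.5),
    T.ToDtype(torch.float32, scale=True),
])

# Detection network built on a CNN backbone pre-trained for image
# classification; the whole model is then fine-tuned with bounding-box
# annotations for the object classes of interest.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None,
    weights_backbone=torchvision.models.ResNet50_Weights.IMAGENET1K_V1,
    num_classes=3,   # e.g. background, person, product item (assumption)
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def train_step(images, targets):
    """One fine-tuning step; each target holds 'boxes' and 'labels'."""
    model.train()
    images = [distort(img) for img in images]   # apply image distortions
    loss_dict = model(images, targets)          # detection losses
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```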

While the person 28 moves around the room, the image capturing devices 22 continue to capture images of the person 28. In a preferred embodiment, each image capturing device 22 captures an image at a time interval, such as every six seconds. The time interval can be predetermined or adjusted in real time based on the number of targets or the size of the area covered. Different image capturing devices 22 may have differing time intervals and may not capture images at exactly the same moments. The analytic device 20, after analyzing the captured images, recognizes the person 28 in the captured images as matching the previously identified person 28, and appends the updated time and location data to the existing structured data set. By combining the data analyzed from images captured at different times, a path of movement of the person 28 across time can be tracked and recorded.
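
The re-identification and path-building step can be realized in many ways. The following minimal sketch, which assumes an appearance embedding vector is available for each detection, matches a new detection to the nearest previously tagged object and appends the new time and location to that object's path. The threshold value and data structures are illustrative assumptions, not part of the disclosure.

```python
import time
import numpy as np

class TrackedObject:
    """Structured data set entry for one tagged object."""
    def __init__(self, object_id, embedding, location):
        self.object_id = object_id
        self.embedding = embedding                 # appearance feature
        self.path = [(time.time(), location)]      # (time, (x, y)) pairs

    def append(self, location):
        self.path.append((time.time(), location))

def match_or_create(tracked, embedding, location, next_id, threshold=0.5):
    """Match a detection to an existing object, or tag a new object."""
    best, best_dist = None, float("inf")
    for obj in tracked:
        dist = float(np.linalg.norm(obj.embedding - embedding))
        if dist < best_dist:
            best, best_dist = obj, dist
    if best is not None and best_dist < threshold:
        best.append(location)                      # same object re-identified
        return best
    new_obj = TrackedObject(next_id, embedding, location)
    tracked.append(new_obj)                        # newly tagged object
    return new_obj
```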

When the person 28 arrives at the product shelf 32 and picks up a product item 30, the analytic device 20 is informed of the action in at least one way. In one embodiment, a weight sensor is disposed at the product shelf 32, such as below the rack of the product item 30. When the product item 30 is picked up, the weight sensor detects a change at a sensed location. The change is for example a weight reduction due to the product item 30 being displaced from the sensed location. The weight sensor is linked to the analytic device 20 and this change is sent to the analytic device 20. By determining the location of the weight sensor, the analytic device 20 knows which product item 30 is being picked up. In another embodiment, the analytic device 20 with the deep learning model 24 recognizes a stock keeping unit (SKU) of the product item 30, and the paths of movement of both the person 28 and the product item 30 are tracked and recorded. Both methods can be employed at the same time, as each method has its advantages: for example, the weight sensor is more reliable in a crowded area where the image capturing devices 22 may be blocked from taking a clear image of the product item 30, while the SKU includes multiple characteristics of the product item 30, such as brand, size, or packaging features, for more advanced analysis. Similarly, other types of sensors can be linked to the analytic device 20 as necessary.
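
A weight-sensor event could be resolved to a product item, for instance, by a lookup table mapping each sensed location to the SKU stocked there. The sketch below is illustrative only; the shelf layout, signal format, and SKU values are assumptions.

```python
# Illustrative shelf layout: sensed location -> SKU stocked there.
SHELF_LAYOUT = {
    ("shelf_32", "slot_A"): "SKU-12345",   # product item 30 (assumed SKU)
    ("shelf_32", "slot_B"): "SKU-67890",
}

def on_weight_change(shelf_id, slot_id, delta_grams):
    """Handle a weight-change signal from a sensor at a sensed location."""
    sku = SHELF_LAYOUT.get((shelf_id, slot_id))
    if sku is None:
        return None
    if delta_grams < 0:                    # weight reduction: item picked up
        return {"event": "pick_up", "sku": sku, "delta": delta_grams}
    return {"event": "put_back", "sku": sku, "delta": delta_grams}

# Example: product item 30 is lifted from slot A of shelf 32.
print(on_weight_change("shelf_32", "slot_A", -250))
```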

In one embodiment, a transaction device 38 is linked to the analytic device 20. When a transaction is made involving the person 28 or the product item 30, the transaction details are sent to the analytic device 20 and integrated into the time and location data of the person 28 or the product item 30. Some examples of transaction details are the method of payment, price, time of transaction, or whether a promotion offer is being utilized.
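
Integration of a transaction record into the structured data set could take roughly the following form. The field names follow the examples of transaction details given above; the function name and record keys are hypothetical.

```python
import time

def integrate_transaction(structured_record, transaction):
    """Merge a transaction record into an object's structured data set.

    `structured_record` holds the tagged object's identity and path of
    movement; `transaction` is the record sent by the transaction device
    (field names are illustrative assumptions).
    """
    event = {
        "timestamp": transaction.get("time", time.time()),
        "payment_method": transaction.get("payment_method"),
        "price": transaction.get("price"),
        "promotion_used": transaction.get("promotion_used", False),
    }
    structured_record.setdefault("transactions", []).append(event)
    return structured_record

# Example: attach a card payment to the record of object X.
record = {"object_id": "X", "path": [(0.0, (3.2, 1.8))]}
integrate_transaction(record, {"payment_method": "card", "price": 4.99})
```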

In one embodiment, the analytic device 20 can be synchronized with other analytic devices 20 to form a distributed edge network. The data can be processed as independent data clusters, and data can be exchanged between different analytic devices 20 to form a more complete data set. In one embodiment, the analytic device 20 is set to connect to an external database at regular time periods to upload the stored data. The uploaded data can be used for further analysis such as product recommendation or targeted advertising.
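
The periodic upload of stored data to an external database could, as one illustrative possibility, be implemented with a simple HTTP client loop such as the following. The endpoint URL, payload shape, and interval are placeholders, not part of the disclosure.

```python
import time
import requests   # third-party HTTP client, used here for illustration

UPLOAD_URL = "https://example.com/analytics/upload"   # placeholder endpoint
UPLOAD_INTERVAL_SECONDS = 3600                        # assumed "regular time period"

def upload_structured_data(records):
    """Upload locally stored structured data sets to the external database."""
    response = requests.post(UPLOAD_URL, json={"records": records}, timeout=10)
    response.raise_for_status()

def run_upload_loop(get_local_records):
    """Periodically push local data; `get_local_records` reads local storage."""
    while True:
        upload_structured_data(get_local_records())
        time.sleep(UPLOAD_INTERVAL_SECONDS)
```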

In one embodiment, the analytic device 20 discards the captured image after the time and location data is analyzed and extracted, for protecting privacy and saving storage space.

In one embodiment, the image capturing devices 22 comprise a number of fish eye cameras. A fish eye camera allows a larger area to be scanned with a single device, and while the captured image is distorted to some extent, the deep learning model 24 can compensate for the distortion and accurately determine the location of the target. A fish eye camera can be used, for example, in open-area stores where installation points may be limited and a larger area needs to be covered by each image capturing device 22.
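
Compensation of fish eye distortion could be performed, for example, with a standard fish-eye camera model such as the one implemented in OpenCV. The intrinsic matrix and distortion coefficients below are placeholder values that would in practice come from calibrating each camera; they are not part of the disclosure.

```python
import numpy as np
import cv2

# Placeholder intrinsics from a prior fish-eye calibration (assumptions).
K = np.array([[600.0,   0.0, 960.0],
              [  0.0, 600.0, 540.0],
              [  0.0,   0.0,   1.0]])
D = np.array([[-0.05], [0.01], [0.0], [0.0]])   # fish-eye distortion coeffs

def undistort_point(pixel_xy):
    """Map a distorted fish-eye pixel to its undistorted pixel position."""
    pts = np.array([[pixel_xy]], dtype=np.float64)     # shape (1, 1, 2)
    undistorted = cv2.fisheye.undistortPoints(pts, K, D, P=K)
    return tuple(undistorted[0, 0])

# Example: correct a detection near the image edge before locating it.
print(undistort_point((1700.0, 900.0)))
```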

In one embodiment, the deep learning model 24 analyzes the captured image to determine demographic information such as the age group or sex of the person 28. The demographic information can be used for recommending products to the person 28.

In an example application of the system of the present invention, a shopping mall has a number of shops and an open area outside the shops. Each shop has its own analytic system of the present invention with an analytic device 20 and image capturing devices 22. The shopping mall also installs an analytic system at the open area, and the analytic devices 20 are all connected to each other to form a distributed edge network. The collected data is sent periodically to a management system for statistical analysis to gather information on customer characteristics, and targeted information can be sent to a user device, for example customized signage directions to a certain shop, or advertisements on promotions or products.

The exemplary embodiments of the present invention are described above. It is to be understood that upon reading the above disclosure, one skilled in the art can change certain details of the present invention without departing from the scope and spirit of the invention, and the scope of protection of the present invention is to be bound by the claims as set forth below.

What is claimed is:
1. A visual and geolocation analytic system including: an analytic device; a number of image capturing devices connected to said analytic device; wherein said image capturing devices capture images of an object at a time interval and send said captured images to said analytic device; said analytic device comprises a deep learning model for analyzing said captured images, allowing said object to be identified and tagged, and allowing a path of movement of said object across time to be tracked; wherein said deep learning model is trained with image classification using convolutional neural networks and fine tuned with bounding boxes and image distortions, so that it is capable of object detection by recognizing distinct attributes within images and to tag positional information on said object.
2. The visual and geolocation analytic system according to claim 1, wherein said deep learning model of said analytic device comprises an object detection system for recognizing a stock keeping unit of a product item.
3. The visual and geolocation analytic system according to claim 1, further comprising a transaction device linked to said analytic device, wherein said transaction device sends a transaction record involving said object to said analytic device, said analytic device integrates said transaction record into said path of movement of said object.
4. The visual and geolocation analytic system according to claim 1, further comprising at least one weight sensor linked to said analytic device, wherein said weight sensor sends a signal to said analytic device when a weight change is detected at a sensed location.
5. The visual and geolocation analytic system according to claim 1, wherein said captured images are discarded from said analytic device after analyzing, and only a structured data set containing an identity and said path of movement of said object across time is retained.
6. The visual and geolocation analytic system according to claim 1, wherein said image capturing devices include fish eye cameras.
7. The visual and geolocation analytic system according to claim 1, wherein said analytic device is synchronized with analytic devices of other systems to form a distributed edge network.
8. The visual and geolocation analytic system according to claim 1, wherein said deep learning model of said analytic device comprises a facial recognition system for recognizing a person.
9. The visual and geolocation analytic system according to claim 8, wherein said facial recognition system further determines demographic information of said person.
10. A method for visual and geolocation analysis, comprising the steps of: capturing images of an object at a time interval with a number of image capturing devices; recognizing and tagging said object from said captured images by an analytic device having a deep learning model; extracting time and location data of said object from said captured images, and tracking a path of movement of said object across time; wherein said deep learning model is trained with image classification using convolutional neural networks and fine tuned with bounding boxes and image distortions, so that it is capable of object detection by recognizing distinct attributes within images and to tag positional information on said object.
11. The method for visual and geolocation analysis according to claim 10, further comprising the steps of: receiving a transaction record involving said object from a transaction device; and integrating said transaction record into said path of movement of said object by said analytic device.
12. The method for visual and geolocation analysis according to claim 10, further comprising the step of receiving a signal from a weight sensor when a weight change is detected at a sensed location.
13. The method for visual and geolocation analysis according to claim 10, further comprising the step of discarding said captured images after analyzing.
14. The method for visual and geolocation analysis according to claim 10, further comprising the step of synchronizing with other analytic devices to form a distributed edge network.
15. A visual and geolocation analytic system including: an analytic device; a number of image capturing devices connected to said analytic device; wherein said image capturing devices capture images of an object at a time interval and send said captured images to said analytic device; said analytic device comprises a deep learning model for analyzing said captured images, allowing said object to be identified and tagged, and allowing a path of movement of said object across time to be tracked; wherein said deep learning model is trained with image classification using convolutional neural networks and fine tuned with bounding boxes and image distortions, so that it is capable of object detection by recognizing distinct attributes within images and to tag positional information on said object; wherein said captured images are discarded from said analytic device after analyzing, and only a structured data set containing an identity and said path of movement of said object across time is retained.